Imitating Inscrutable Enemies: Learning from Stochastic Policy Observation, Retrieval and Reuse
نویسندگان
چکیده
In this paper we study the topic of CBR systems learning from observations in which those observations can be represented as stochastic policies. We describe a general framework which encompasses three steps: (1) it observes agents performing actions, elicits stochastic policies representing the agents’ strategies and retains these policies as cases. (2) The agent analyzes the environment and retrieves a suitable stochastic policy. (3) The agent then executes the retrieved stochastic policy, which results in the agent mimicking the previously observed agent. We implement our framework in a system called JuKeCB that observes and mimics players playing games. We present the results of three sets of experiments designed to evaluate our framework. The first experiment demonstrates that JuKeCB performs well when trained against a variety of fixed strategy opponents. The second experiment demonstrates that JuKeCB can also, after training, win against an opponent with a dynamic strategy. The final experiment demonstrates that JuKeCB can win against "new" opponents (i.e. opponents against which JuKeCB is untrained).
منابع مشابه
Boosting Passage Retrieval through Reuse in Question Answering
Question Answering (QA) is an emerging important field in Information Retrieval. In a QA system the archive of previous questions asked from the system makes a collection full of useful factual nuggets. This paper makes an initial attempt to investigate the reuse of facts contained in the archive of previous questions to help and gain performance in answering future related factoid questions. I...
متن کاملA Dynamic-Bayesian Network framework for modeling and evaluating learning from observation
Learning from observation (LfO), also known as learning from demonstration, studies how computers can learn to perform complex tasks by observing and thereafter imitating the performance of a human actor. Although there has been a significant amount of research in this area, there is no agreement on a unified terminology or evaluation procedure. In this paper, we present a theoretical framework...
متن کاملLearning about Recycling and Reuse through Pre-School Games
Learning about Recycling and Reuse through Pre-School Games S.M. Shobeyri, Ph.D. H. Meybodi A. Saraadipoor S. Rashidi Environmental education is the most fundamental approach to environmental protection and can begin from the pre-school level and through the use of games. To show the effectiveness of this approach, a random sample of 30 preschoolers of ages 5 and 6 was se...
متن کاملGeneralizing Movements with Information-Theoretic Stochastic Optimal Control
Stochastic Optimal Control (SOC) is typically used to plan a movement for a specific situation. While most SOC methods fail to generalize this movement plan to a new situation without re-planning, we present a SOC method that allows us to reuse the obtained policy in a new situation as the policy is more robust to slight deviations from the initial movement plan. In order to improve the robustn...
متن کاملA Dynamic Bayesian Network Framework for Learning from Observation
Learning from Observation (a.k.a. learning from demonstration) studies how computers can learn to perform complex tasks by observing and thereafter imitating the performance of an expert. Most work on learning from observation assumes that the behavior to be learned can be expressed as a state-to-action mapping. However most behaviors of interest in real applications of learning from observatio...
متن کامل